Summary of Results:


The objective of this effort was to extend the state of the art in Information Visualization,
Zoomable User Interfaces, and closely related areas that support the Semantic Web.
Originally,  the  University  of  Maryland,  College  Park  (UMCP)  was  to  build  general- 
purpose  visualization  tools  that  have  broad  applicability  in  the  semantic  web  community, 
and  subsequently,  they  were  to  build  specific  browsing  tools  to  make  it  easier  to  browse 
complex  visual  semantic  web  search  results. 

However, the semantic web visualization tools required longer-term development than
initially anticipated. Originally planned as a three-year effort, the work was discontinued after the
first year, when progress on the general-purpose visualization tools indicated that the
available resources and schedule were inadequate for building specific browsing tools for
complex visual semantic web search results. Given the funding and scheduling
constraints, UMCP elected to withdraw from the effort after completing the general-purpose
visualization tool development tasks.

The results of the first year of effort in developing general-purpose visualization tools for
taxonomy visualization are found in the appendix “TaxonTree: Visualizing Biodiversity
Information” (also available at: http://www.cs.umd.edu/local-cgi-bin/hcil/sr.pl?number=HCIL-2003-40).
This paper contains the basis upon which UMCP planned to support networks and richer ontologies.

In  addition  to  the  findings  in  the  attached  paper,  the  first  year  deliverables  contained 
several  software  development  releases  providing  general-purpose  visualization 
capabilities.  These  general  purpose  tools,  and  some  communities  of  interest  for  the  tools, 
are  found  at  the  references  below: 


Software  Releases: 

1.  Maintenance  release  of  Jazz 

UMCP released a maintenance release of Jazz, a Zoomable User Interface platform
available at www.cs.umd.edu/hcil/jazz. Jazz is a Java-based open-source platform used
worldwide in many projects. Version 1.3 was primarily a bug-fix release. UMCP
officially "retired" Jazz with this release.

2.  Two  maintenance  releases  of  Piccolo 

2a. UMCP released version 1.0 of Piccolo, the successor to Jazz. Piccolo is a
lightweight version of Jazz (in Java) that provides the most commonly used features with
a simpler API and a smaller footprint, intended for use within applets on the web
(necessary for the Semantic Web). Piccolo provides a general-purpose toolkit for
developing structured 2D graphics applications in general, and zoomable user interfaces
in particular, and is available at www.cs.umd.edu/hcil/piccolo.

UMCP collected public uses of Jazz and Piccolo. Ten groups outside of the University of
Maryland are building both research and commercial applications using Jazz and Piccolo.
These are listed at www.cs.umd.edu/hcil/piccolo/applications/index.shtml.

UMCP also received interest from users at Gentleware (www.gentleware.com), a company that
released a Piccolo-based commercial UML diagramming/editing tool, and from personnel at
the University of Victoria who are using Piccolo to visualize ontologies as a plug-in for
Protege.

2b.  UMCP  released  version  1.0  beta  of  a  port  of  Piccolo  to  .NET  (with  C#),  so  it  can  be 
distributed  as  part  of  Microsoft’s  .NET  platform.  Piccolo  is  available  as  OSS*  under  the 
Mozilla  Public  License  at  http://www.cs.umd.edu/hcil/piccolo.  This  is  necessary  to 
ensure  the  broadest  general  use  of  Zoomable  User  Interfaces  since  .NET  seems  likely  to 
become  a  serious  competitor  to  Java.  In  addition,  there  is  a  version  of  .NET  for  PDAs 
and  cell  phones  which  is  sufficient  to  support  Piccolo.  The  micro  edition  of  Java,  on  the 
other  hand,  has  minimal  support  for  graphics  which  would  make  supporting  Java-based 
Zoomable  User  Interfaces  on  mobile  devices  very  difficult.  UMCP  will  maintain  the  Java 
and  C#  versions  of  Piccolo. 


Publication: 

• TaxonTree: A paper on TaxonTree, a variation of UMCP's SpaceTree program
for visualizing biodiversity taxonomic data, was accepted for publication at the
conference on Advanced Visual Interfaces (AVI). This paper is available at:
www.cs.umd.edu/local-cgi-bin/hcil/sr.pl?number=HCIL-2003-40


*OSS  -  Open  Source  Software 
Appendix: TaxonTree: Visualizing Biodiversity Information

http://www.cs.umd.edu/local-cgi-bin/hcil/sr.pl?number=HCIL-2003-40
TaxonTree: Visualizing Biodiversity Information

Cynthia Sims Parr1, Bongshin Lee1,2, Dana Campbell1, and Benjamin B. Bederson1,2

1 Human-Computer Interaction Lab
2 Computer Science Department,
Univ. of Maryland,
College Park, MD 20742

{csparr, bongshin, bederson}@cs.umd.edu and danac@wam.umd.edu

+1-301-405-7445


ABSTRACT 

Biodiversity  databases  have  recently  become  widely 
available  to  the  public  and  to  other  researchers.  To  retrieve 
information  from  these  resources,  users  must  understand  the 
underlying  data  schemas.  Many  other  domains  share  this 
problem. 

We  developed  an  interface,  TaxonTree,  to  visualize  the 
taxonomic  hierarchy  of  animal  names.  We  applied 
integrated  searching  and  browsing  so  that  users  need  not 
have  complete  knowledge  either  of  appropriate  keywords  or 
the  organization  of  the  data. 

Our  qualitative  user  study  of  TaxonTree  in  an 
undergraduate  course  is  the  first  to  describe  usage  patterns 
and  issues  in  the  biodiversity  domain.  Users  indicated  that 
the  system  was  usable  and  tree  visualization  aided  their 
understanding  of  the  data.  Interestingly,  users  with 
different  levels  of  interest  in  the  domain  had  different 
interaction  preferences.  The  study  provides  further 
evidence  for  the  value  of  integrated  searching  and  browsing 
and  interactive  tree  visualization  for  information  retrieval 
and  understanding. 

Author  Keywords 

Biodiversity,  Hierarchy/Tree  Visualization,  Information 
Retrieval,  Searching,  Browsing,  Animation. 

ACM  Classification  Keywords 

H.5.2 [Information Interfaces and Presentation]: User
Interfaces - Graphical user interfaces (GUI), Interaction
styles, Screen design, Evaluation/methodology; H.3.3
[Information Storage and Retrieval]: Information Search and
Retrieval - Information filtering, Search process; J.3 [Life
and Medical Sciences]: Biology and genetics.

INTRODUCTION 
Biodiversity  Information 

An  understanding  of  the  nature  and  magnitude  of  biological 
diversity  is  fundamental  to  most  pressing  environmental 
and  conservation  debates;  the  budding  field  of  biodiversity 
informatics  is  poised  to  transform  those  debates  [10,  21]. 
Biodiversity  can  be  defined  as  “the  variety  of  life”  [27]. 
Typical questions in this field include: What kinds of
organisms exist? What is their genetic diversity? How are
these organisms distributed geographically? What
processes are responsible for generating, maintaining, or
changing this variety? How is this variety important to us
and how do we affect it?

Biodiversity  databases  designed  to  answer  these  questions 
are  as  complex  as  molecular  and  medical  biology  resources. 
They  contain  organism-related  information  such  as 
distribution,  taxonomy,  natural  history,  and  conservation 
data.  Perhaps  the  best  developed  are  a  number  of 
taxonomic  databases  for  researchers.  The  Species  2000  [1] 
initiative  seeks  a  comprehensive  index  of  the  world’s 
species.  Museums  provide  online  access  to  their  specimen 
databases,  often  through  gateways  such  as  the  Ocean 
Biogeographic  Information  System  [4],  and  are  currently 
developing  tools  for  knowledge  discovery  and  distributed 
querying (e.g., Mammal Networked Information System [3]).

Unlike  molecular  biology  databases,  many  biodiversity 
databases  serve  a  broad  audience.  Wildlife  departments  and 
agricultural  extension  agencies  maintain  state-specific 
websites  on  local  game,  non-game,  and  pest  species.  These 
resources  as  well  as  commercial  and  private  non-profit 
natural  history  databases  are  aimed  at  concerned  citizens 
and  outdoor  enthusiasts. 

Whether  they  present  primary  data  or  encyclopedic 
summaries,  these  resources  share  a  reliance  on  taxonomy, 
or  the  scientific  naming  of  organisms.  Biologists  give 
organisms  scientific  names,  usually  Latin  or  Latin-like,  that 
must  follow  certain  rules  to  be  considered  official  by  the 
scientific  community.  Organisms  thought  to  be  close 
relatives  are  put  into  a  group  together,  and  related  groups 
are  likewise  grouped,  forming  a  nested  hierarchy,  which 
can  be  visualized  as  a  tree.  In  a  Linnaean  classification, 
groupings  receive  a  rank,  such  as  Kingdom,  Phylum,  Class, 
Order,  Family,  Genus,  and  Species  (in  decreasing  order  of 
inclusiveness).  Common  names,  on  the  other  hand,  are 
informal  ways  of  referring  to  organisms.  While  they  are  not 
standardized  (they  differ  according  to  language  and  dialect 
of  the  laypeople  using  them)  common  names  can  be  very 
useful  for  non-experts. 
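The nested hierarchy described above can be sketched as a small tree data structure. This is only an illustration under stated assumptions: the `Taxon` class, its field names, and the four sample taxa are hypothetical and are not taken from any of the databases discussed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Linnaean ranks in decreasing order of inclusiveness, as listed in the text.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


@dataclass
class Taxon:
    """One node in the nested hierarchy (hypothetical structure for illustration)."""
    scientific_name: str
    rank: str
    common_names: List[str] = field(default_factory=list)
    # The parent link is excluded from repr/compare to avoid infinite recursion.
    parent: Optional["Taxon"] = field(default=None, repr=False, compare=False)
    children: List["Taxon"] = field(default_factory=list)

    def add_child(self, child: "Taxon") -> "Taxon":
        # A child must sit at a strictly less inclusive rank than its parent.
        if RANKS.index(child.rank) <= RANKS.index(self.rank):
            raise ValueError(f"{child.rank} cannot be nested inside {self.rank}")
        child.parent = self
        self.children.append(child)
        return child

    def lineage(self) -> List[str]:
        """Scientific names from the root of the tree down to this taxon."""
        names = []
        node: Optional["Taxon"] = self
        while node is not None:
            names.append(node.scientific_name)
            node = node.parent
        return list(reversed(names))


# Sample data: a single path from Kingdom Animalia down to Order Anura.
animalia = Taxon("Animalia", "Kingdom", ["animals"])
chordata = animalia.add_child(Taxon("Chordata", "Phylum", ["chordates"]))
amphibia = chordata.add_child(Taxon("Amphibia", "Class", ["amphibians"]))
anura = amphibia.add_child(Taxon("Anura", "Order", ["frogs", "toads"]))

print(" > ".join(anura.lineage()))
```

Keeping common names as a list on each node reflects the point that they are informal and not unique, while each node carries exactly one scientific name and one rank.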
Biodiversity Information Retrieval

Current  interfaces  to  online  biodiversity  resources  can  be 
divided  into  three  broad  categories,  none  of  which  take 
advantage  of  current  advances  in  HCI  research  such  as 
dynamic  queries  [6],  query  previews  [16],  and  visualization 
[13].  The  simplest  interfaces  involve  one  or  two  free  text 
query  fields  and  either  pull-down  or  radio  button  lists  of 
categories  to  be  searched.  The  current  Animal  Diversity 
Web  interface  is  an  example  of  such  a  simple  structured 
query  interface  (Figure  1).  This  interface  has  the  advantage 
of  simplicity,  but  may  not  be  used  effectively  by  users  who 
do  not  know  currently  recognized  scientific  or  common 
names. 
Figure 1. ADW Simple Structured Query Interface

Some  interfaces  rely  heavily  on  controlled  keyword  choices 
and  combine  all  possible  structured  queries  on  the  same 
page.  This  kind  of  interface  is  typical  of  large  conservation 
data  resources,  and  is  no  doubt  driven  by  the  complexity  of 
the  different  query  types  expected  from  users.  While  it 
reduces  the  need  to  know  keywords  for  each  category,  a 
novice  user  or  even  a  professional  may  be  stymied  by  how 
much  or  how  little  of  the  page  may  be  used  for  their  search, 
and  by  the  role  of  free  text  in  the  search. 

The  third  and  most  complex  kind  combines  the  first  two  by 
providing  different  pages  for  different  kinds  of  structured 
queries.  This  makes  it  possible  to  support  many  kinds  of 
queries  without  including  them  all  on  the  same  page,  but  a 
user  must  determine  which  page  has  the  query  form  of 
interest.  It  is  possible  to  build  a  complex  query  across 
query  types,  but  the  user  must  build  queries  across  pages. 

These  interfaces  all  have  problems  for  end-users.  First, 
they  require  that  the  users  have  significant  knowledge  about 
the  underlying  database,  and  the  structure  of  the  website 
search  interface.  Without  such  knowledge,  users  regularly 
get empty or, alternatively, huge search results, which are
often meaningless. Second, they often provide search
results  in  the  form  of  a  long  list  (Figure  1).  Such  lists  lack 
biological  context  that  could  help  users  not  only  find 
specific  information  but  understand  the  larger  biodiversity 
patterns  in  the  results.  Our  approach  to  solving  these 
problems  is  to  create  an  interface  that  couples  browsing  and 
searching  of  the  taxonomic  hierarchy  using  modern 
visualization  techniques  in  order  to  help  users  build  an 
understanding  of  the  underlying  data. 

Below  we  review  research  on  visualizing  hierarchies, 
integrating  searching  and  browsing,  and  other  techniques 
that  motivate  our  approach.  We  then  describe  the  tool  we 
designed,  TaxonTree.  Finally,  we  describe  a  user  study  of 
TaxonTree  designed  to  characterize  user  behavior  in  the 
biodiversity  domain. 

RELATED  WORK 
Visualizing  Hierarchies 

Over  the  last  20  years  there  has  been  much  research  on  how 
to  effectively  display  and  interact  with  hierarchies.  One 
approach  is  node-link  diagrams.  Cone  Trees,  a  3D 
hierarchy  visualization,  makes  the  context  easily  visible 
with  a  focus  and  context  display  [24].  However,  it  has 
limits  on  scale  and  suffers  from  occlusions.  Hyperbolic 
Browser  solves  the  occlusion  problems  of  Cone  Trees  by 
using  hyperbolic  space  instead  of  3D  perspective  [19]. 
WebTOC  displays  a  hierarchical  representation  of  the 
documents  local  to  the  site  as  a  Table  of  Contents  [22]. 
Users  can  expand  and  collapse  the  hierarchy.  Links  are 
listed  with  a  bar  that  represents  the  volume  of  information 
available  when  following  that  link.  PadPrints  provides  a 
dynamically  built  hierarchical  history  of  visited  web  pages 
to  aid  web  navigation  [18].  Zooming  enables  users  to 
control  how  much  context  is  viewed.  SpaceTree  combines 
the  node-link  tree  diagram  with  a  zooming  environment  that 
dynamically  lays  out  branches  of  the  tree  to  best  fit  the 
available  screen  space  [23].  Degree  of  Interest  Trees 
compute  a  degree  of  interest  value  for  each  node  in  the 
hierarchy  and  display  an  overview  of  the  complete 
hierarchy  by  only  showing  detail  for  nodes  with  high  DOI 
values  [12]. 

Integrated  Searching  and  Browsing 

A  number  of  projects  have  integrated  searching  and 
browsing.  SuperBook  shows  the  number  of  occurrences  of 
the  search  word  to  the  left  of  section  titles  in  a  table  of 
contents  [15].  However,  users  still  have  to  expand  items 
manually  to  see  how  the  words  are  distributed  in  the  lower 
levels.  AMIT  (Animated  Multiscale  Interactive 
TreeViewer)  integrates  fisheye  treebrowsing  with  search 
and filtering for web information access [28]. It displays
only the paths to the matching nodes, and the size of a node
represents its relevance score. In LifeLines, searches can be
performed on a patient record, and all parts of the
record that match are highlighted [7]. SpaceTree supports dynamic filters
and  searches  showing  results  in  the  node  link  diagram  [23]. 
Yee  et  al.  applied  integrated  searching  and  browsing  to  user 
interfaces for large image collections within their web-based
Flamenco system [29]. DateLens is a fisheye
calendar  that  highlights  the  days  containing  appointments 
that  match  the  search  criteria  [9].  In  addition,  its  attribute- 
mapped  scrollbar  represents  which  days  are  highlighted 
outside  of  the  current  view. 

Animation 

Bederson  and  Boltman  [8]  found  that  animation  improves 
users’  ability  to  reconstruct  the  information  space,  and  does 
not increase task performance time. In applications where
the user navigates around a fixed spatial data space, they
recommend brief animated transitions. Others have made
similar  recommendations  [11]. 

SpaceTree 

Because  SpaceTree  [23]  (Figure  2)  combines  hierarchy 
visualization,  integrated  searching  and  browsing,  and 
animation,  we  chose  to  adapt  it  to  the  biodiversity  domain 
as  a  first  step  towards  understanding  the  domain  and  its 
users.  SpaceTree  is  a  tree  browser  that  enables  users  to 
explore  large  trees  by  adding  a  zooming  environment  that 
dynamically  lays  out  branches  to  best  fit  the  conventional 
node-link  diagram  to  the  available  screen  space. 




Figure  2.  SpaceTree 

Users  can  navigate  the  tree  by  clicking  on  nodes  or  by  using 
the  arrow  keys.  Based  on  user  feedback  that  they  didn’t 
want  to  open  the  tree  one  level  at  a  time  when  there  was 
room  to  open  more  levels  at  once,  SpaceTree  maximizes  the 
number of lower levels to be opened. When users change
the focus, the tree is animated to its new layout, which
makes full use of screen space, in three main steps: 1) it trims
the tree of the branches that would overlap the new branch
to be opened; 2) it moves the trimmed tree so that the new
tree layout will be centered in the window; 3) it expands the
branch out of the new focus point. While animating,
SpaceTree  retains  landmarks  to  help  users  maintain  their 
orientation.  It  uses  the  current  focus  and  the  path  up  to  the 
root  as  landmarks  and  highlights  the  ancestor  path  of  the 
current  focus.  SpaceTree  provides  icons  to  preview  the 
topology  of  branches  that  cannot  be  fully  opened  because  of 
lack  of  space. 


SpaceTree also supports filters and searches. As users type
a string, SpaceTree highlights the relevant nodes within the
tree. Users can see a filtered view of the tree, displaying
only the paths to the matching nodes.
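A filtered view of this kind can be sketched as a recursive pruning of the tree. The sketch below is not SpaceTree's implementation; it assumes a plain substring match, and the `TREE` dictionary and the `filter_paths` function are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy tree: each node name maps to the list of its child names.
TREE: Dict[str, List[str]] = {
    "Animalia": ["Chordata", "Mollusca"],
    "Chordata": ["Mammalia", "Amphibia"],
    "Mollusca": ["Cephalopoda"],
    "Mammalia": [],
    "Amphibia": [],
    "Cephalopoda": [],
}


def filter_paths(tree: Dict[str, List[str]], node: str, query: str) -> Optional[dict]:
    """Return the pruned subtree under `node` as nested dicts, or None.

    A node is kept if its name contains `query` (case-insensitive) or if
    at least one of its descendants is kept. Everything else is dropped,
    so only the paths leading to matching nodes remain.
    """
    kept_children = {}
    for child in tree[node]:
        subtree = filter_paths(tree, child, query)
        if subtree is not None:
            kept_children[child] = subtree
    if query.lower() in node.lower() or kept_children:
        return kept_children
    return None


# Only the path Animalia > Chordata > Mammalia survives the filter "mam".
print({"Animalia": filter_paths(TREE, "Animalia", "mam")})
```

A bottom-up traversal is used so that each node is visited once; an interior node needs to know only whether any child subtree survived.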

Figure 3. TaxonTree lays out branches of the classification
tree  to  best  fit  the  available  screen  space.  Magnified  node 
shows  synapomorphies  (evolutionarily  significant,  diagnostic 
characteristics)  and  “more”  triangle  indicating  further  nodes. 

TAXONTREE 

To  visualize  the  Linnaean  classification  for  taxonomic 
names  in  Kingdom  Animalia,  we  developed  TaxonTree 
(Figure 3) by extending SpaceTree. TaxonTree (see footnote 1) allows
users  to  browse  and  search  a  tree  of  about  200,000  animal 
names  that  we  constructed  by  integrating  data  from  a 
number  of  public  and  private  sources  [2,  5,  25,  26]. 
TaxonTree,  like  SpaceTree,  uses  animated  zooming  and 
integrated  searching  and  browsing.  Search  results  are 
presented  in  the  larger  context  of  their  classification  tree  to 
help  users  understand  patterns  in  biodiversity  information. 

Target  Audience  and  Design  Partners 

There  have  been  no  studies  to  describe  usage  patterns  and 
issues  in  the  biodiversity  information  domain.  We  used 
methodologies  adapted  from  collaborative  design  [14]  to 
explore  the  domain  and  develop  an  interface  for  it  in 
concert  with  its  users. 

Our target audience is students taking a second-year college
course at the University of Maryland entitled Introduction to
Animal Diversity. As biology majors, they are becoming
familiar  with  the  biological  content  but  cannot  be 
considered  experts.  To  incorporate  the  extra  detail  that  the 
course  covered  about  known  evolutionary  relationships,  we 
constructed a specialized tree of animal names that further
resolved the Linnaean classification at higher levels [17, 20].

1 TaxonTree is available for download at
www.cs.umd.edu/hcil/taxontree.

We worked with a five-member team of “design partners”
invited from the Animal Diversity course; in all, seven women
participated as design partners during this study.

Extensions 

While  major  functionalities  are  inherited  from  SpaceTree, 
TaxonTree  expands  on  its  capabilities  in  several  ways. 

Domain-specific  visualization 

We  changed  the  visualization  of  TaxonTree  to  support 
biodiversity  data.  First,  we  displayed  common  names  with 
a  different  font  style  so  that  users  can  distinguish  them 
easily.  Second,  we  provided  links  to  external  web  pages  on 
three  different,  publicly  available  websites  (Animal 
Diversity  Web,  Tree  of  Life,  University  of  California 
Museum  of  Paleontology).  Thus,  TaxonTree  can  be  used  as 
a  portal  to  public  biodiversity  resources. 

Some  features  were  designed  explicitly  for  the  University 
of  Maryland  Animal  Diversity  course,  such  as  visual 
distinction  of  required  course  material  from  non-required 
material,  bookmarks  of  names  for  future  reference,  and 
display  of  biological  ranks  for  each  node  as  tool  tips.  We 
also  added  a  way  to  see  'synapomorphies'  (Figure  3)  - 
attributes  of  nodes  that  show  how  that  node  is  distinguished 
from  its  siblings. 




Figure  4.  TaxonTree  presents  search  results  in  the  larger 
biological  context  of  their  classification  tree.  All  squids  are 
mollusks  but  there  are  several  subgroups. 

Search 

Users can search both scientific and common names in the
classification. They can also search on synapomorphies
and the full text of the Animal Diversity Web, with results
shown within TaxonTree. TaxonTree presents search
results in the biological context of their classification tree
and highlights them (Figure 4). This helps users understand
patterns in the results; for example, animals with a common
name including “dolphin” appear in two very different
branches of the tree. Since the search result tree is often
larger than the screen, TaxonTree automatically zooms out to
fit it in.
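The "dolphin" observation can be reproduced with a small sketch that searches scientific and common names together and reports each match with its lineage. The `TAXA` table, the `lineage` and `search` functions, and the sample rows are assumptions made for illustration; they are not drawn from the TaxonTree dataset, and intermediate ranks are omitted for brevity.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical sample: scientific name -> (common names, parent scientific name).
TAXA: Dict[str, Tuple[List[str], Optional[str]]] = {
    "Animalia": (["animals"], None),
    "Chordata": (["chordates"], "Animalia"),
    "Mammalia": (["mammals"], "Chordata"),
    "Actinopterygii": (["ray-finned fishes"], "Chordata"),
    "Delphinidae": (["oceanic dolphins"], "Mammalia"),
    "Coryphaena": (["dolphinfishes"], "Actinopterygii"),
}


def lineage(name: str) -> List[str]:
    """Scientific names from the root down to `name`."""
    path: List[str] = []
    current: Optional[str] = name
    while current is not None:
        path.append(current)
        current = TAXA[current][1]
    return list(reversed(path))


def search(query: str) -> Dict[str, List[str]]:
    """Match `query` against scientific and common names (case-insensitive)."""
    q = query.lower()
    results = {}
    for name, (common_names, _parent) in TAXA.items():
        if q in name.lower() or any(q in c.lower() for c in common_names):
            results[name] = lineage(name)
    return results


# Each hit is printed with its full path so the two branches are visible.
for name, path in search("dolphin").items():
    print(name, "<-", " > ".join(path))
```

Returning the lineage with each hit, rather than a flat list of names, is what makes the divergence visible: in this sample the two matches share Animalia and Chordata and then part ways.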

Modified  interaction 

As  explained  above,  at  the  first  step  of  animation  when 
users  change  the  focus,  SpaceTree  automatically  closes  the 
branches  that  would  overlap  a  newly  opened  branch.  This 
has  two  major  advantages:  1)  the  screen  is  less  cluttered;  2) 
the  siblings  of  the  focus  node  are  always  adjacent. 
However,  we  found  that  some  of  the  more  sophisticated 
'auto-layout'  features  of  SpaceTree  were  confusing  and 
disorienting  to  our  undergraduate  design  partners  and 
project  members  with  content  experience  and  interest.  We 
thus  introduced  an  optional  simpler  navigation  mode 
whereby  users  can  open  and  close  nodes  manually. 

Scaling  up 

To  show  the  overview  of  the  whole  tree  and  to  provide 
dynamic  filters,  SpaceTree  requires  the  entire  tree  to  be 
loaded  into  memory.  Since  this  is  impractical  for  large 
trees,  we  built  TaxonTree  to  access  the  data  from  a  database. 
In  this  way  we  are  able  to  scale  up  to  support  interaction 
with  very  large  trees  and  multiple  attributes. 
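One way to avoid loading the entire tree is to fetch the children of a node only when that node is opened. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The paper does not describe TaxonTree's database or schema, so the `taxon` table, its columns, and the `children` function are assumptions made purely for illustration.

```python
import sqlite3

# In-memory database standing in for the real data store (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxon (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
# An index on parent_id keeps child lookups fast as the table grows.
conn.execute("CREATE INDEX idx_taxon_parent ON taxon (parent_id)")
conn.executemany(
    "INSERT INTO taxon (id, parent_id, name) VALUES (?, ?, ?)",
    [
        (1, None, "Animalia"),
        (2, 1, "Chordata"),
        (3, 1, "Mollusca"),
        (4, 2, "Mammalia"),
        (5, 3, "Cephalopoda"),
    ],
)
conn.commit()


def children(connection: sqlite3.Connection, parent_id: int):
    """Fetch only the direct children of one node, sorted by name."""
    cursor = connection.execute(
        "SELECT id, name FROM taxon WHERE parent_id = ? ORDER BY name",
        (parent_id,),
    )
    return cursor.fetchall()


# Opening the root node triggers a single query for its two children.
print(children(conn, 1))
```

With this pattern, memory use depends on how many nodes the user has opened rather than on the size of the whole tree.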

USER  STUDY 

In  May  2003  we  conducted  a  qualitative  study  with  three 
main  goals.  First,  we  wanted  to  characterize  how  users  of 
this  domain  think  about  biodiversity  information  in  general. 
Are  they  more  likely  to  look  for  information  using  scientific 
or  common  names?  What  taxonomic  rank  (species  or 
higher)  are  they  more  likely  to  target?  What  kind  of 
information  are  they  most  interested  in?  Second,  we 
investigated  the  usability  and  interaction  preferences  with 
this  particular  software.  Are  users  comfortable  with 
integrated  searching  and  browsing,  and  with  animation  and 
zooming?  Finally,  we  examined  how  this  kind  of 
information  retrieval  interface  can  assist  information 
understanding  in  this  domain.  Do  students  use  the  tree 
visualization  to  successfully  complete  tasks  that  require 
interpretation  and  understanding  of  the  underlying  data 
structure? 

We  chose  a  qualitative  methodology  because  user  behavior 
in  this  domain  has  never  been  studied.  Also,  the  aim  of 
TaxonTree  is  to  foster  content  understanding  so  standard 
metrics  of  efficiency  are  unlikely  to  be  appropriate. 
Insights  gained  from  this  study  should  guide  both  design 
and  quantitative  assessment  of  future  tools. 

Participants 

We recruited 18 undergraduate volunteers (8 male, 10
female, 18 to 20 years old) from the Animal Diversity
course at the University of Maryland. None of them were part
of the design partner team described above. Each
participant was given ten dollars for his/her participation.
We tested five pairs of users and eight single users for a
total of 13 sessions, or user “teams.” The study occurred at
the end of the semester so participants were largely familiar
with  the  biological  content  but  could  not  be  considered 
experts.  The  software  had  been  demonstrated  in  lecture  and 
distributed  to  all  students  on  CD-ROM  for  personal  use  two 
weeks  prior  to  the  study.  Users  reported  they  had  used  the 
program  for  an  average  time  of  half  an  hour,  and  eight  out 
of  18  had  not  used  it  at  all. 

Procedure 

Each  session  lasted  30  to  45  minutes.  Each  user  filled  out  a 
survey  to  determine  their  computer  usage  background  and 
amount  of  time  previously  spent  with  TaxonTree.  They 
were  seated  in  front  of  a  2GHz  Windows  XP  laptop  with  an 
ordinary  mouse,  a  1280x1024  pixel  display  and  512MB 
RAM,  placed  on  a  standard  office  desk.  We  videotaped  the 
computer  screen  throughout  the  testing  to  capture  both  the 
actions  and  verbal  comments  of  the  users. 

Users  then  received  a  brief  demonstration  of  TaxonTree 
features.  User  teams  were  asked  to  perform  nine 
information  retrieval  tasks,  described  below.  When  user 
teams  had  obvious  difficulty  completing  the  tasks  we  gave 
them  prompts.  At  the  end  of  the  tasks,  we  asked  open- 
ended  questions  about  what  each  user  liked  and  found 
difficult  about  the  software.  We  asked  what  kind  of 
information  they  generally  would  like  about  animals. 

Tasks 

Two  tasks  were  general  and  open-ended,  without  single 
correct  answers.  We  designed  these  tasks  to  assess  user 
preferences  for  information  targets  and  strategies  to  reach 
them.  We  asked  them  to  use  TaxonTree  to  find  information 
about  an  animal  of  their  choice.  Depending  on  the  strategy 
they  took  in  task  1  (searching  or  browsing),  we  asked  them 
in  task  2  to  choose  another  target  animal  and  use  the  other 
strategy  to  find  it. 

The  other  seven  tasks  were  more  specific  and  had  a  limited 
number  of  correct  answers.  These  tasks  assessed  a  user’s 
ability  to  use  most  of  the  features  of  the  software,  to  further 
examine  their  preferences  for  information-seeking 
strategies,  and  to  examine  the  role  of  the  interface  in 
understanding  the  information. 

These  tasks  were:  3.  Find  an  extinct  taxon.  4.  Count  how 
many  extinct  taxa  you  might  need  to  know  about  for  the 
final  exam.  5.  Find  and  name  the  taxon  whose  members  are 
all  united  with  the  synapomorphy  "Lactation."  6.  What  is 
the  sister  group  to  this  group  of  lactating  animals?  7.  Now 
try  searching  on  the  common  name,  “dolphin.”  What  do 
you  notice  about  the  results?  8.  Find  some  victims  or 
carriers  of  malaria.  9.  What  do  you  notice  about  these 
victims  or  carriers? 

RESULTS 

Characterizing  Users  In  This  Domain 

We noticed during testing sessions that some users were
clearly interested in the content. These users verbally
expressed prior content knowledge as they worked on tasks,
or asked questions indicating curiosity about information
beyond the task. For example, a user asked “Why isn’t
there anything about mosquitoes?” when looking at the results
of a search about malaria. In contrast, some users never
departed  from  the  tasks  at  hand.  Figure  5  illustrates  how 
often  users  offered  extra  information  indicating  content 
interest.  Guided  by  Figure  5,  we  defined  high  interest  users 
as  those  who  spontaneously  offered  extra  content 
information  during  at  least  4  of  the  9  tasks;  the  others  were 
labeled  low  interest.  Consistent  with  this  categorization, 
three  users  labeled  low  interest  in  this  way  had  stated  they 
lacked  interest  in  the  Animal  Diversity  course. 
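The labeling rule stated above (extra content information offered in at least 4 of the 9 tasks) can be written as a one-line threshold test. The function name and the sample counts are hypothetical; the counts are not data from the study.

```python
TOTAL_TASKS = 9               # number of tasks in each session
HIGH_INTEREST_THRESHOLD = 4   # minimum number of tasks with extra content offered


def interest_level(tasks_with_extra_info: int) -> str:
    """Label a user team as 'high' or 'low' interest using the stated rule."""
    if not 0 <= tasks_with_extra_info <= TOTAL_TASKS:
        raise ValueError("count must be between 0 and the number of tasks")
    return "high" if tasks_with_extra_info >= HIGH_INTEREST_THRESHOLD else "low"


# Hypothetical counts, chosen only to show both sides of the threshold.
for count in (0, 3, 4, 9):
    print(count, interest_level(count))
```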


[Plot omitted; x-axis: Number of tasks]

Figure 5. Distribution of 13 user teams based on the number of
tasks in which they spontaneously offered additional biological
information, indicating their interest in the domain.

Low-interest and high-interest users reported similar hours
of experience with the application and similar levels of
comfort with computing. Six of the eight males in the study
were in low-interest user teams; two of the ten females were
in low-interest teams.

Users tended to be interested in looking for animals using
common names and at levels above the species, e.g., “frogs”
(Order Anura). Specifically, when asked to choose any
animal  to  find,  users  gave  20  out  of  26  initial  targets  as 
common  names  rather  than  scientific  names.  Many  targets 
were  clearly  above  the  species  level  (14  of  26  targets)  while 
4  were  ambiguous  and  8  were  species  level.  Fewer  than 
half  of  the  search  targets  (12  of  26  targets)  were  required 
course  content.  Low  and  high  interest  user  teams  had 
similar  search  targets.  When  asked  the  kinds  of  information 
they  were  interested  in,  5  of  13  teams  mentioned  way  of  life 
(food  habits,  behavior,  ecology).  Five  of  13  teams  noted 
that  they  enjoyed  learning  unique  characteristics  of  animals 
-  interesting  superlatives  or  what  sets  an  animal  apart  from 
others.  Four  user  teams  mentioned  that  they  wanted  only 
the  information  necessary  to  pass  their  course.  Two  user 
teams  mentioned  an  interest  in  evolutionary  relationships. 
One  user  wanted  information  to  distinguish  dangerous  from 
harmless  animals. 




Usability 

The  interface  seemed  comfortable  to  users  once  they  knew 
what  features  were  available.  Interaction  with  and 
interpretation  of  the  nodes  was  apparently  intuitive,  because 
even  users  who  had  never  used  the  program  immediately 
began  opening  and  closing  nodes.  Few  users  needed 
prompts  explaining  “more”  triangles  (shown  in  Figure  3), 
panning  or  zooming,  node-clicking,  or  the  ability  to  search. 
About  85%  of  the  prompts  we  gave  related  instead  to  using 
our  specific  search  categories  and  controlling  the  view 
options:  how  much  of  the  tree  was  displayed  (all  nodes  as 
they  are  opened,  just  the  subset  required  for  the  course  as 
they  are  opened,  or  all  required  nodes  at  once).  Low  interest 
users  actually  needed,  on  average,  fewer  prompts  per 
session  (4.3)  to  complete  tasks  compared  to  high  interest 
users  (7.4). 

Searching  and  Browsing 

Most  users  used  both  searching  and  browsing  strategies 
together  in  at  least  one  task.  Only  four  user  teams  always 
used  a  single  strategy  within  each  task;  three  of  these  four 
teams  were  low  interest  users.  Five  of  13  teams  browsed 
the  tree  before  choosing  a  target  or  changed  their  target 
while  browsing. 

Most  users  preferred  browsing  the  tree  over  searching. 
Only  three  of  13  teams  used  searching  as  an  initial  strategy 
-  these  were  all  high  interest  users  who  probably  had  better 
ideas  of  what  search  terms  to  use.  Even  after  a  successful 
search,  10  out  of  11  subject  teams  returned  to  a  strategy  of 
browsing.  When  asked  why,  they  told  us  it  was  more  fun 
than  searching,  that  they  wanted  to  refresh  their  memories, 
and  that  they  didn’t  know  exactly  what  to  search  on. 

Task  Completion 

Users  completed  92%  of  all  tasks  without  prompts  to 
interpret  results  shown  onscreen. 

Some  tasks  that  asked  for  direct  interpretation  of  the  tree 
were  very  easy  for  the  user  teams.  In  task  5,  12  of  13  teams 
needed  no  prompts  to  correctly  associate  an  attribute  (the 
synapomorphy  “lactation”)  with  the  name  of  the 
appropriate  node.  All  but  one  team  successfully  completed 
task  6,  identifying  a  sister  group  from  a  search  result  by 
opening  a  nearby  node.  Only  three  teams  needed  a  prompt. 
Task 9, which asked what users noticed about victims and carriers
of malaria, was readily answered. Eleven of 13 teams gave an
immediate  answer  relating  to  the  tree  structure  (such  as,  the 
search  results  were  in  related  branches  of  the  tree).  Task  4 
asked  “Count  how  many  extinct  taxa  you  might  need  to 
know  about  for  the  final  exam.”  All  but  one  team 
immediately  moved  from  displaying  all  overview  nodes, 
including  16  color-coded  as  extinct,  to  task  completion 
(counting  all  the  nodes  that  were  color  coded  as  extinct). 

However,  some  tasks  were  clearly  harder  than  others.  Task 
7  asked  users  to  draw  inferences  from  a  search  on  the 
common  name  “dolphin.”  We  scored  understanding  success 
if the users gave at least one of two answers. First, there
are many kinds of organisms whose common name includes
the word “dolphin.” Second, organisms with a common
name including the word “dolphin” appear in more than one
very different branch of the animal kingdom. The first
inference  was  immediately  drawn  by  all  but  one  of  the  13 
user  teams.  Such  an  inference  would  be  nearly  impossible 
to  make  quickly  using  a  typical  list  of  search  engine  results. 
The second, however, is a more sophisticated inference that
requires an understanding of biological relationships. It was
mentioned by only five of 13 user teams.

Task  8  asked  users  to  conduct  a  search  for  carriers  or 
victims  of  malaria.  This  task  posed  particular  difficulty 
because  of  its  sensitivity  to  both  the  search  terms  chosen 
and the category of search that was run. Three user teams
needed more than two prompts to complete the search.
These plus an additional two user teams failed to
look  at  the  web  pages  in  the  results  to  be  sure  that  their 
search  terms  were  in  the  appropriate  context.  However,  8  of 
13  teams  did  check  for  relevance. 

Tree  visualization  was  very  effective  in  helping  users 
complete  tasks.  Task  9,  “What  do  you  notice  about  these 
victims  or  carriers?”  could  be  completed  either  by 
interpreting  a  tree  visualization  of  search  results  or  by 
applying  prior  knowledge  to  those  search  results.  For 
example,  a  user  response  such  as  “all  of  these  victims  seem 
to  live  in  forests”  would  be  an  example  of  prior  knowledge, 
while  “all  of  these  victims  are  in  the  vertebrate  part  of  the 
tree”  indicates  use  of  the  tree  to  interpret  the  results. 
Although  three  or  four  user  teams  did  use  prior  knowledge, 
only  one  set  of  users  used  it  as  their  first  answer  to  the 
question.  All  the  others  gave  tree  information  for  their  first 
answer. 

Domain interest seemed correlated with domain expertise:
high interest users, but not low interest users, tended to
use their prior knowledge effectively to help solve tasks.

User  Comments 

User  responses  to  open  ended  questions  are  summarized  in 
Table  1.  Users  said  that  TaxonTree  was  usable  and  had 
desirable  content  (synapomorphies,  external  web  links, 
course  information).  Several  mentioned  explicitly  that 
TaxonTree’s  visualization  would  be  more  useful  to  them 
than  accessing  the  same  information  in  their  lecture  notes  or 
in  the  textbook. 

Users  had  difficulty  with  unfamiliar  features  (search 
categories  and  view  menu  options).  The  other  negative 
comments  related  to  information  quantity.  Some  users 
noted  the  difficulties  inherent  in  displaying  large  amounts 
of  information  (font  sizes,  zooming  problems).  Some 
wanted  more  refined  search  results,  while  others  felt  that 
merely  having  so  much  information  available  to  browse  or 
search  was  daunting. 

All  but  two  user  teams  offered  spontaneous  positive 
comments  while  completing  tasks.  Visualizing  search 
results  in  the  tree  structure  elicited  the  most  positive 
comments  (6  user  teams),  along  with  the  availability  of  web 
pages  with  more  information  (4).  Four  user  teams  also 
were  excited  about  the  ability  to  see  an  overview  tree  of  the 
information  necessary  for  their  course. 


What users liked                                    # of user teams
Easy to learn and use                               9
Tree visualization                                  9
Synapomorphies                                      7
Ability to search different categories              6
How evolutionary history is presented               4
How tree is interactive                             3
Links to external web sites                         3
Seeing which content required for their course      3

What users found difficult                          # of user teams
Search categories were hard to understand           4
Font too small, especially when zoomed out          3
Too much information, too many search results       3
Had problems zooming                                3
Had problems understanding view menu                3

After  formal  testing  we  asked  some  user  teams  to  try  the 
auto  node  closing  feature.  Their  comments  were  consistent 
with  the  impressions  we  gained  during  our  design  phase: 
high  interest  user  teams,  like  our  design  partners  and  our 
staff  biologists,  preferred  manually  closing  and  opening 
branches.  Low  interest  users  preferred  auto-closing.  We 
plan  further  tests  to  determine  if  this  difference  in 
interaction  preference  is  really  related  to  content  expertise, 
and  if  so  whether  individuals  transition  from  one  style  to 
the  other  as  they  become  experts. 


Figure 6. Example of a tree diagram currently familiar to
biology students.


Table  1.  User  comments  to  open-ended  questions.  Responses 
given  by  fewer  than  three  user  teams  are  not  included. 

DISCUSSION  AND  CONCLUSION 

Our  qualitative  user  study  of  TaxonTree  is  the  first  to 
describe  usage  patterns  and  issues  in  the  biodiversity 
domain. TaxonTree demonstrates the value of interactive tree
visualization with integrated searching and browsing in this
domain.

Characterizing  Biodiversity  Users 

We  recognize  that  our  self-selected  user  sample  is  not 
representative  of  the  public  at  large,  but  we  can  begin  to 
characterize  a  subset  of  biodiversity  information  seekers  - 
learners.  Interest  in  biodiversity  was  female-biased.  Even 
interested  biology  majors  primarily  seek  information  using 
common  names.  They  often  look  for  names  at  levels  above 
the  species  level,  and  have  some  prior  knowledge  about 
what  they  are  looking  for.  Basic  natural  history  is  of 
interest,  as  well  as  particular  characteristics  distinguishing 
an  animal  from  others.  Though  drawn  from  a  course 
focusing  on  evolutionary  relationships,  only  two  user  teams 
mentioned  a  special  interest  in  them.  This  suggests  a 
general  lack  of  appreciation  for  the  value  of  these 
relationships  in  understanding  biodiversity.  Our  users 
expressed  little  concern  over  practical  information  such  as 
which  animals  are  dangerous,  or  how  to  care  for  pets. 

Role  of  Content  Knowledge/Interest 

It  is  difficult  to  separate  effects  of  content  interest  from 
content  knowledge  but  we  think  this  is  a  promising  area  for 
future  research.  In  our  study,  high  interest  users  often  used 
prior  knowledge  to  solve  tasks.  Was  this  because  they  had 
more  knowledge  (a  consequence  of  their  content  interest)  or 
were  they  more  willing  to  use  it? 


Interactive  Tree  Visualization 

The style of tree diagram that biology students are currently
familiar with (shown in Figure 6) differs from
TaxonTree’s style in three ways: 1) it shows names
only at leaves; 2) internal nodes are labeled with brackets
outside of the tree; 3) every branch has a fixed angle.
Despite  these  differences,  users  easily  understood 
TaxonTree’s  tree  structure.  We  conclude  that  our 
combination  of  interaction  style  and  tree  representation  is 
very  intuitive.  It  could  therefore  be  useful  in  many  other 
domains,  but  a  closer  look  at  the  trade-offs  of  the  different 
representations  is  warranted. 

One possible explanation for why users preferred to browse the
tree rather than search it is that they enjoyed the animated
interaction. If this is true, making interaction fun may be
another benefit of animation.

It  is  beneficial  to  present  search  results  in  an  interactive 
classification  tree  that  shows  biological  context.  Users 
easily  interpreted  search  results,  quickly  using  the  tree 
structure  to  discover  the  quantity  of  biologically  unique 
results. They often made more sophisticated inferences
about relationships among the search results, which is
nearly impossible using a typical list of search results.
They also carried out additional browsing, which gave them a
better sense of the search results.